Abstract. We prove the existence of a limit of the finite volume probability
measures generated by tree growth rules in Ford's alpha model of phylogenetic
trees. The limiting measure is shown to be concentrated on the set of trees
consisting of exactly one infinite spine with finite, identically and
independently distributed outgrowths.

1. Introduction 

Graphs are used in many fields of science to describe relationships between in- 
dividuals and to model actual physical objects. The former case includes social 
networks [2], phylogenetic trees [HI [13] [14], the world-wide web p] and much more.
The latter case includes discrete objects such as macromolecules pU] and branched 
polymers [2]. The graphs can also serve as discrete approximations to inherently 
continuous objects, an example of this being triangulation of manifolds in quantum 
gravity, see e.g. [4]. 

Random graphs are commonly used to describe real deterministic networks.
Interactions and relations in the networks can be complicated but their characteristics
are in some cases captured by random graph models, defined by simple rules which
are motivated by the nature of the real network. The alpha model, introduced by
D. Ford in [13], is an example of a random graph model, intended to describe
phylogenetic trees. It is a one parameter model of randomly growing, rooted, planar,
binary trees with the following growth rules. Start from a single rooted edge and,
from a tree on $n$ leaves, select individual internal edges with probability weight $\alpha$
and individual leaves with probability weight $1 - \alpha$, where $0 < \alpha < 1$. Graft a new
leaf to a selected edge and thus generate a tree on $n + 1$ leaves, see Fig. 1.

Ford proved that the model is Markovian self-similar, which means informally
that a subtree below an edge is distributed identically to the whole tree; a more
precise definition will be given in the main section. He also showed that typical
distances in the trees scale as $n^{\alpha}$ with the system size $n$. The Hausdorff dimension
of a randomly growing tree is defined to be $d_H$ given that typical distances scale as
$n^{1/d_H}$. Thus, in the alpha model $d_H = 1/\alpha$.

In a recent paper [14] the continuum limit of the model has been established in 
the context of fragmentation processes [5]. A generalization to multinary trees is 
introduced in [7] in the alpha-gamma model where in addition to the growth rules 
of the alpha model, edges can be grafted onto vertices, increasing their degree. The 
alpha-gamma trees are shown to be Markovian self-similar and a continuum limit 
is established. 

Our motivation to study the alpha model comes from the fact that it is a certain 
limiting case of a model of random trees which grow by vertex splitting, introduced
in [8] where the relation is explained. In general the vertex splitting model does
not share some of the technically convenient properties of the alpha model such 
as Markovian self-similarity, and it is more difficult to do exact calculations. The 
hope is that some of these properties might hold asymptotically for large trees and 
therefore a good understanding of the alpha model could be helpful. 

The purpose of this paper is to establish convergence of the finite volume
measures generated by the alpha model to a measure on infinite trees. For $0 < \alpha < 1$,
the infinite measure is shown to be concentrated on the set of trees consisting of
exactly one infinite path from the root to infinity (spine) with finite, identically and
independently distributed outgrowths.



2. Convergence of the finite volume measures

We start with a few definitions before presenting the model. In this paper we
only consider rooted, binary, planar trees. Rooted means that we mark a single
vertex of degree 1, binary means that vertices are only allowed to have degree 1 or
3 and the planarity condition distinguishes between left and right branchings. The
root and vertices of degree 3 will be referred to as internal vertices and vertices of
degree 1, besides the root, will be referred to as leaves. Denote the set of trees on
$n$ leaves by $T_n$ and denote the set of all finite or infinite trees by $T$.

The alpha model is defined by probability distributions $\pi_{\alpha,n}$ on $T_n$, for $n \geq 1$,
constructed in the following recursive way. Assign probability one to the unique
trees in $T_1$ and $T_2$. Given $\pi_{\alpha,n}$ for some $n \geq 2$, $\pi_{\alpha,n+1}$ is generated by first
selecting a tree $\tau \in T_n$ according to $\pi_{\alpha,n}$. Next an individual edge $(a, b)$ is selected
from $\tau$ with probability $\alpha/(n - \alpha)$ if $a$ and $b$ are internal vertices and with probability
$(1 - \alpha)/(n - \alpha)$ if one is an internal vertex and the other a leaf. The edge $(a, b)$ is
removed from $\tau$ and two new vertices $c$ and $d$ are introduced along with the edges
$(a, c)$, $(c, b)$ and $(c, d)$. Equal probability is assigned to left and right branching of
the new edge $(c, d)$. One can think about this procedure as grafting a new edge to
an existing edge in $\tau$, see Fig. 1. The probability of a tree $\tau' \in T_{n+1}$ is thus given
by

$$\pi_{\alpha,n+1}(\tau') = \sum_{\tau \in T_n} \pi_{\alpha,n}(\tau)\, P(\tau \to \tau') \tag{1}$$

where $P(\tau \to \tau')$ is the probability of growing the tree $\tau'$ from $\tau$ by the grafting
process.

Figure 1. The grafting process. The link $(a, b)$ is selected with
probability weight $\alpha$ and the link $(a', b')$ is selected with probability
weight $1 - \alpha$. The selected link is removed, two new vertices $c$ and
$d$ and three new links are added as shown in the figure. In this
example, $a$ is the root which is indicated by a dashed line.
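The recursion (1) can be carried out exactly for small $n$. The following is a minimal sketch, not part of Ford's construction: it assumes a hypothetical encoding in which a leaf is the empty tuple and an internal vertex is a pair (left subtree, right subtree), so that every subtree stands for the edge directly above it. The names `grafts`, `grow` and `finite_volume_measure` are illustrative only.

```python
from fractions import Fraction

LEAF = ()  # assumed encoding: a leaf; an internal vertex is (left, right)


def leaves(tree):
    """Number of leaves |tree|."""
    return 1 if tree == LEAF else leaves(tree[0]) + leaves(tree[1])


def grafts(tree):
    """Yield (is_leaf_edge, new_tree) for every edge and both branching sides."""
    # Graft onto the edge directly above `tree`, new leaf to the left or right.
    yield (tree == LEAF, (LEAF, tree))
    yield (tree == LEAF, (tree, LEAF))
    if tree != LEAF:
        left, right = tree
        for is_leaf_edge, new_left in grafts(left):
            yield (is_leaf_edge, (new_left, right))
        for is_leaf_edge, new_right in grafts(right):
            yield (is_leaf_edge, (left, new_right))


def grow(dist, alpha):
    """One step of (1): map a distribution on T_n to one on T_{n+1}."""
    out = {}
    for tree, prob in dist.items():
        n = leaves(tree)
        for is_leaf_edge, new_tree in grafts(tree):
            weight = (1 - alpha) if is_leaf_edge else alpha
            # Edge probability weight / (n - alpha), times 1/2 for the side.
            out[new_tree] = out.get(new_tree, 0) + prob * weight / (n - alpha) / 2
    return out


def finite_volume_measure(alpha, n):
    """Exact distribution on trees with n leaves, as a dict tree -> probability."""
    dist = {LEAF: Fraction(1)}
    for _ in range(n - 1):
        dist = grow(dist, alpha)
    return dist


if __name__ == "__main__":
    for tree, prob in finite_volume_measure(Fraction(1, 3), 4).items():
        print(prob, tree)
```

Exact rational arithmetic is used so that normalization can be checked with equality rather than a floating point tolerance.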

Figure 2. An example of a tree $\tau_0$ which has a root indicated by
the dashed line. The tree $\tau_0$ branches at the nearest neighbour of
the root to two subtrees, $\tau_1$ to the left and $\tau_2$ to the right as is
indicated by the dotted lines.

The model has a property called Markovian self-similarity [13] which is essential
in the inductive proof of the theorem in this paper. Markovian self-similarity means
that there exists a function $q_\alpha(\cdot, \cdot)$ such that for every finite tree $\tau_0$ which branches
at the nearest neighbour of the root to a left tree $\tau_1$ and a right tree $\tau_2$ (see Fig. 2)
the following holds

$$\pi_{\alpha,|\tau_0|}(\tau_0) = q_\alpha(|\tau_1|, |\tau_2|)\, \pi_{\alpha,|\tau_1|}(\tau_1)\, \pi_{\alpha,|\tau_2|}(\tau_2) \tag{2}$$

where $|\tau|$ denotes the number of leaves in a tree $\tau$. In words, this says that
$q_\alpha(n_1, n_2)$ is the probability of a tree branching to subtrees of sizes $n_1$ and $n_2$.
Furthermore, given that the subtrees are of these sizes they are distributed
independently by $\pi_{\alpha,n_1}$ and $\pi_{\alpha,n_2}$. The function $q_\alpha$ is explicitly known [13] and is given
by



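$$q_\alpha(n_1, n_2) = \frac{n!\, \Gamma_\alpha(n_1)\, \Gamma_\alpha(n_2)}{n_1!\, n_2!\, \Gamma_\alpha(n)} \left( \frac{\alpha}{2} + \frac{(1 - 2\alpha)\, n_1 n_2}{n(n - 1)} \right)$$

where $n = n_1 + n_2$,

$$\Gamma_\alpha(n) = (n - 1 - \alpha)(n - 2 - \alpha) \cdots (2 - \alpha)(1 - \alpha), \quad \text{and} \quad \Gamma_\alpha(1) = 1. \tag{3}$$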
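Since $q_\alpha(n_1, n_2)$ is the probability of branching into a left subtree of size $n_1$ and a right subtree of size $n_2$, its values should sum to one over all splits $n_1 + n_2 = n$. The sketch below checks this with exact rational arithmetic, assuming the form of $q_\alpha$ and $\Gamma_\alpha$ written in (3); the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def gamma_alpha(alpha, n):
    """Gamma_alpha(n) = (n-1-alpha)(n-2-alpha)...(1-alpha), with Gamma_alpha(1) = 1."""
    result = Fraction(1)
    for k in range(1, n):
        result *= k - alpha
    return result


def q(alpha, n1, n2):
    """Branching probability q_alpha(n1, n2) as in (3)."""
    n = n1 + n2
    prefactor = (
        factorial(n) * gamma_alpha(alpha, n1) * gamma_alpha(alpha, n2)
        / (factorial(n1) * factorial(n2) * gamma_alpha(alpha, n))
    )
    bracket = alpha / 2 + (1 - 2 * alpha) * n1 * n2 / Fraction(n * (n - 1))
    return prefactor * bracket


if __name__ == "__main__":
    alpha = Fraction(2, 5)
    for n in range(2, 8):
        total = sum(q(alpha, n1, n - n1) for n1 in range(1, n))
        print(n, total)
```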



Before proceeding to the theorem we give a short explanation of what is meant
by convergence of probability measures. For a tree $\tau \in T$ let $B_R(\tau)$ be the subtree
of $\tau$ which is spanned by the vertices at distance less than or equal to $R$ from the
root of $\tau$. Define a metric $d$ on $T$ by

$$d(\tau, \tau') = \inf\left\{ \frac{1}{1 + R} \;\middle|\; B_R(\tau) = B_R(\tau') \right\}. \tag{4}$$
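For finite trees the metric (4) can be evaluated directly: two distinct trees are at distance $1/(1+R)$, where $R$ is the largest radius at which their balls agree. The following sketch assumes a hypothetical nested-tuple encoding (a leaf is `()`, an internal vertex is a pair of subtrees, and the vertex next to the root sits at distance 1); a vertex at distance exactly $R$ appears in the ball as a vertex of degree 1.

```python
from fractions import Fraction

LEAF = ()  # assumed encoding: a leaf; an internal vertex is (left, right)


def _truncate(tree, depth):
    """Cut `tree` below `depth` further levels; cut vertices look like leaves."""
    if depth == 0 or tree == LEAF:
        return LEAF
    left, right = tree
    return (_truncate(left, depth - 1), _truncate(right, depth - 1))


def ball(tree, radius):
    """B_R(tree): the part of the tree within graph distance `radius` of the root."""
    if radius == 0:
        return None  # only the root vertex itself
    return _truncate(tree, radius - 1)


def distance(t1, t2):
    """d(t1, t2) = inf{ 1/(1+R) : B_R(t1) = B_R(t2) } for finite trees."""
    if t1 == t2:
        return Fraction(0)
    radius = 0
    # Distinct finite trees must differ at some finite radius, so this stops.
    while ball(t1, radius + 1) == ball(t2, radius + 1):
        radius += 1
    return Fraction(1, 1 + radius)


if __name__ == "__main__":
    cherry = (LEAF, LEAF)
    print(distance(cherry, (cherry, LEAF)))
```

Trees that agree far from the root are close in this metric, which is why convergence of the ball probabilities is the natural notion of convergence for the measures.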



For some properties of the metric space $(T, d)$ see [8], [11]. We will establish weak
convergence, as $n \to \infty$, of the measures $\pi_{\alpha,n}$, viewed as probability measures on $T$,
to a probability measure $\pi_\alpha$. This means that for all bounded functions $f$ which
are continuous in the topology generated by the metric $d$

$$\int_T f(\tau)\, d\pi_{\alpha,n} \to \int_T f(\tau)\, d\pi_\alpha, \quad \text{as } n \to \infty. \tag{5}$$

Theorem 2.1. Let $0 < \alpha < 1$. The measures $\pi_{\alpha,n}$, viewed as probability measures
on $T$, converge weakly, as $n \to \infty$, to a probability measure $\pi_\alpha$ on infinite trees
which is concentrated on the set of trees with one infinite rooted spine with finite
outgrowths i.i.d. by

$$\mu_\alpha(\tau) = \frac{\alpha\, \Gamma_\alpha(|\tau|)}{|\tau|!}\, \pi_{\alpha,|\tau|}(\tau). \tag{6}$$

The probabilities of right and left branching of outgrowths are equal (see Fig. 3).

Figure 3. The infinite spine with finite $\mu_\alpha$-outgrowths.
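For $\mu_\alpha$ in (6) to be a probability distribution, the total weight $\alpha\, \Gamma_\alpha(m)/m!$ of outgrowths with $m$ leaves must sum to one over $m \geq 1$. A short calculation (an observation added here, not taken from the source) shows that the partial sums telescope: $\sum_{m=1}^{M} \alpha\, \Gamma_\alpha(m)/m! = 1 - \Gamma_\alpha(M+1)/M!$, and the remainder is a product of factors $(k - \alpha)/k < 1$ that decays to zero. The sketch below verifies the identity exactly; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def size_weight(alpha, m):
    """alpha * Gamma_alpha(m) / m!: mu_alpha-mass of all outgrowths with m leaves."""
    gamma = Fraction(1)
    for k in range(1, m):
        gamma *= k - alpha
    return alpha * gamma / factorial(m)


def remainder(alpha, M):
    """Gamma_alpha(M + 1) / M! written as the product of (k - alpha) / k, k = 1..M."""
    result = Fraction(1)
    for k in range(1, M + 1):
        result *= (k - alpha) / k
    return result


if __name__ == "__main__":
    alpha = Fraction(1, 2)
    for M in (1, 10, 100, 1000):
        partial = sum(size_weight(alpha, m) for m in range(1, M + 1))
        print(M, float(partial), float(remainder(alpha, M)))
```

The slow decay of the remainder reflects the heavy tail of the outgrowth sizes.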

Proof. We call the maximum graph distance from the root to a leaf in a tree
the height of the tree. Let $T^{(R)}$ be the set of rooted trees of height $R$. The metric
space $(T, d)$ is compact and therefore it is sufficient to show that for any $R \geq 1$ and
any $\tau_0 \in T^{(R)}$ the sequence

$$\pi_{\alpha,n}(\{\tau \mid B_R(\tau) = \tau_0\}) =: \pi^{(R)}_{\alpha,n}(\tau_0) \tag{7}$$

converges to a limit $\pi^{(R)}_{\alpha}(\tau_0)$ as $n \to \infty$ [11]. We show this by induction on $R$.
For $R = 1$ this is trivial since $B_1(\tau) \in T^{(1)}$ for all $\tau$. Now assume that for some $R$
and all $\tau \in T^{(R)}$, $\pi^{(R)}_{\alpha,n}(\tau)$ converges as $n \to \infty$. Choose a tree $\tau_0 \in T^{(R+1)}$ and,
without loss of generality, assume it branches at the nearest neighbour of the root
to a left tree $\tau_1 \in T^{(R)}$ and a right tree $\tau_2 \in T^{(S)}$ (see Fig. 2) where $S \leq R$. Then



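$$\begin{aligned}
\pi^{(R+1)}_{\alpha,n}(\tau_0) &= \sum_{n_1+n_2=n} q_\alpha(n_1, n_2)\, \pi^{(R)}_{\alpha,n_1}(\tau_1)\, \pi^{(R)}_{\alpha,n_2}(\tau_2) \\
&= \frac{\alpha}{2}\, \frac{n!}{\Gamma_\alpha(n)} \sum_{n_1+n_2=n} \frac{\Gamma_\alpha(n_1)\, \Gamma_\alpha(n_2)}{n_1!\, n_2!}\, \pi^{(R)}_{\alpha,n_1}(\tau_1)\, \pi^{(R)}_{\alpha,n_2}(\tau_2) \\
&\quad + (1 - 2\alpha)\, \frac{(n-2)!}{\Gamma_\alpha(n)} \sum_{n_1+n_2=n} \frac{\Gamma_\alpha(n_1)\, \Gamma_\alpha(n_2)}{(n_1-1)!\, (n_2-1)!}\, \pi^{(R)}_{\alpha,n_1}(\tau_1)\, \pi^{(R)}_{\alpha,n_2}(\tau_2).
\end{aligned} \tag{8}$$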

If $S < R$ then $\pi^{(R)}_{\alpha,n_2}(\tau_2) = 0$ when $n_2 > |\tau_2|$ and it is obvious from the induction
hypothesis that $\pi^{(R+1)}_{\alpha,n}(\tau_0)$ converges. Therefore assume that $S = R$.

Note that in (8) it always holds that either $n_1 \leq n - 1$ and $n_2 \leq n$ or $n_2 \leq n - 1$
and $n_1 \leq n$. Therefore we have the upper bound

$$\pi^{(R+1)}_{\alpha,n}(\tau_0) \leq \frac{n!}{\Gamma_\alpha(n)} \sum_{n_1+n_2=n} \frac{\Gamma_\alpha(n_1)\, \Gamma_\alpha(n_2)}{n_1!\, n_2!}.$$

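Terms in the sums in (8) for which $n_1 \geq n/2$ and $n_2 \geq A$ or $n_2 \geq n/2$ and $n_1 \geq A$,
where $A \geq 1$ is some constant, are therefore bounded from above by

$$\frac{2\, n!}{\Gamma_\alpha(n)} \sum_{\substack{n_1+n_2=n \\ n_1 \geq n/2,\ n_2 \geq A}} \frac{\Gamma_\alpha(n_1)\, \Gamma_\alpha(n_2)}{n_1!\, n_2!} \leq \frac{2\, n!\, \Gamma_\alpha([n/2])}{\Gamma_\alpha(n)\, [n/2]!} \sum_{n_2=A}^{\infty} \frac{\Gamma_\alpha(n_2)}{n_2!} \leq C \sum_{n_2=A}^{\infty} \frac{\Gamma_\alpha(n_2)}{n_2!} \xrightarrow[A \to \infty]{} 0 \tag{9}$$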



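where $C$ is a constant. The remaining contribution to (8) is from terms where
$n_1 \geq n/2$ and $n_2 < A$ or $n_2 \geq n/2$ and $n_1 < A$. Notice that the second term in that
contribution to (8) will be of one order lower in $n$ than the first term. Therefore
it is enough to show that the first term converges as $n \to \infty$ since then the second
term clearly converges to zero. The contribution to the first term is

$$\begin{aligned}
&\frac{n!}{\Gamma_\alpha(n)}\, \frac{\alpha}{2} \sum_{i=1}^{2} \sum_{\substack{n_1+n_2=n \\ n_i < A}} \frac{\Gamma_\alpha(n_1)\, \Gamma_\alpha(n_2)}{n_1!\, n_2!}\, \pi^{(R)}_{\alpha,n_1}(\tau_1)\, \pi^{(R)}_{\alpha,n_2}(\tau_2) \\
&\qquad \xrightarrow[n \to \infty]{} \frac{\alpha}{2} \sum_{i=1}^{2} \sum_{m=1}^{A-1} \frac{\Gamma_\alpha(m)}{m!}\, \pi^{(R)}_{\alpha,m}(\tau_i)\, \pi^{(R)}_{\alpha}(\tau_j) \\
&\qquad \xrightarrow[A \to \infty]{} \frac{1}{2} \sum_{i=1}^{2} \sum_{m=1}^{\infty} \frac{\alpha\, \Gamma_\alpha(m)}{m!}\, \pi^{(R)}_{\alpha,m}(\tau_i)\, \pi^{(R)}_{\alpha}(\tau_j)
\end{aligned} \tag{10}$$

where $j$ denotes the index in $\{1, 2\}$ different from $i$.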



In the first step we used the induction hypothesis. This is the limit of $\pi^{(R+1)}_{\alpha,n}(\tau_0)$
as $n \to \infty$. The fact that the sum in (9) converges to zero as $A \to \infty$ proves that
the measure is concentrated on the set of trees with exactly one infinite spine. The
last sum in (10) shows that the distribution of the finite outgrowths is given by $\mu_\alpha$.

□ 
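The mechanism behind the proof is that, for a fixed outgrowth size $m$, the branching probability $q_\alpha(n - m, m)$ approaches $\frac{\alpha}{2}\, \Gamma_\alpha(m)/m!$ as $n \to \infty$, which is the weight appearing in (6) together with the factor $1/2$ for left or right branching. The following numerical sketch illustrates this; it assumes the identity $\Gamma_\alpha(n) = \Gamma(n - \alpha)/\Gamma(1 - \alpha)$, which follows from (3), and uses log-gamma functions to avoid overflow. The function names are illustrative.

```python
from math import exp, lgamma


def q(alpha, n1, n2):
    """q_alpha(n1, n2) from (3), evaluated in floating point via log-gamma."""
    n = n1 + n2
    log_prefactor = (
        lgamma(n + 1) - lgamma(n1 + 1) - lgamma(n2 + 1)
        + lgamma(n1 - alpha) + lgamma(n2 - alpha)
        - lgamma(n - alpha) - lgamma(1 - alpha)
    )
    bracket = alpha / 2 + (1 - 2 * alpha) * n1 * n2 / (n * (n - 1))
    return exp(log_prefactor) * bracket


def limit_weight(alpha, m):
    """(alpha / 2) * Gamma_alpha(m) / m!, the claimed large-n limit."""
    return (alpha / 2) * exp(lgamma(m - alpha) - lgamma(1 - alpha) - lgamma(m + 1))


if __name__ == "__main__":
    alpha, m = 0.3, 3
    for n in (10, 100, 10_000, 1_000_000):
        print(n, q(alpha, n - m, m), limit_weight(alpha, m))
```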



3. Conclusions 

We have shown that the finite volume measures $\pi_{\alpha,n}$ generated by the growth
rules of Ford's alpha model converge, as $n \to \infty$, to a measure on infinite trees. The
limiting measure is concentrated on the set of trees consisting of exactly one infinite
spine with finite outgrowths, independently distributed by $\mu_\alpha$. The emergence of a
single spine is well known from models of size conditioned critical Galton Watson
trees [12]. The case $\alpha = 1/2$ is in fact a special case of such a tree. However, in the
vertex splitting model it is possible that an infinite number of spines emerge. This
happens for example in the special case of the preferential attachment model [9] and
in the case $\alpha = 0$ in the alpha model. In both these cases the Hausdorff dimension
is infinite. One interesting question is whether a finite Hausdorff dimension is
equivalent to the emergence of a single spine and whether an infinite Hausdorff
dimension is equivalent to the existence of an infinite number of spines in the vertex
splitting model.

An obvious next step is to use the formula for the limiting measure to calculate
some global properties of the alpha trees such as the Hausdorff dimension and the
spectral dimension. The Hausdorff dimension of an infinite random tree given by a
probability distribution $\nu$ is defined as $d_H$ if

$$\langle V_R \rangle_\nu \sim R^{d_H} \tag{11}$$

where $V_R(\tau)$ is the number of edges in a ball $B_R(\tau)$ and $\langle \cdot \rangle_\nu$ denotes expectation
with respect to $\nu$. The above definition should coincide with the one given by the
scaling of a typical distance in a finite tree as discussed in the introduction. This
will be checked explicitly in a forthcoming paper.

The spectral dimension of an infinite random tree as above is defined as $d_s$ if

$$\langle p_\tau(t) \rangle_\nu \sim t^{-d_s/2} \tag{12}$$

where $p_\tau(t)$ is the probability that a simple random walk which starts at the root
of a tree $\tau$ at time $t = 0$ is back at the root at time $t$. The techniques used in [12]
give a way to estimate the spectral dimension of the alpha model from knowledge
of the large $R$ behaviour of the quantities $\langle |B_R| \rangle_{\mu_\alpha}$, $\mu_\alpha(\{\tau \mid \text{height of } \tau > R\})$
and $\langle |B_R|^{-1} \rangle_{\mu_\alpha}$. Using the formula for the limiting measure and the Markovian
self-similarity properties of the outgrowths one can write recursion equations for
generating functions of these quantities. Preliminary results indicate that indeed
$d_H = 1/\alpha$ in agreement with the finite scaling definition and $d_s = 2/(1 + \alpha)$. In
the case $\alpha = 1$ this is trivially true and in the case $\alpha = 1/2$ the result is known to
be true by connection to Galton Watson trees [12]. For other values of $\alpha$ this has
not yet been proven.